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Abstract. We consider a variation of the prototype combinatorial-optimisation 
problem known as graph-colouring. Our optimisation goal is to colour the vertices 
of a graph with a fixed number of colours, in a way to maximise the number of 
different colours present in the set of nearest neighbours of each given vertex. This 
problem, which we pictorially call palette- colouring, has been recently addressed as 
a basic example of problem arising in the context of distributed data storage. Even 
though it has not been proved to be NP complete, random search algorithms find the 
problem hard to solve. Heuristics based on a naive belief propagation algorithm are 
observed to work quite well in certain conditions. In this paper, we build upon the 
mentioned result, working out the correct belief propagation algorithm, which needs 
to take into account the many-body nature of the constraints present in this problem. 
This method improves the naive belief propagation approach, at the cost of increased 
computational effort. We also investigate the emergence of a satisfiable to unsatisfiable 
"phase transition" as a function of the vertex mean degree, for different ensembles of 
sparse random graphs in the large size ("thermodynamic") limit. 
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1. Introduction 

Graph colouring is a prototype of combinatorial optimisation or constraint satisfaction 
problems [1]. It is NP-complete, so that it can be taken as a benchmark for optimisation 
algorithms. Moreover, it is at the core of a large number of technologically relevant 
combinatorial problems, such as scheduhng. The goal is to assign a colour to each 
vertex of a given graph (with a fixed number of available colours), in such a way that 
no pair of vertices connected by an edge have the same colour. Alternatively, one may 
be satisfied with a suboptimal solution, i.e., minimising the number of vertex pairs with 
the same colour. 

A nice variant of the above problem has been recently proposed and investigated 
by Bounkong and coworkers [21 [3]. The variation consists in requiring that the set of 
colours assigned to each given vertex and its neighbours includes all available colours. 
The latter problem, which we pictorially call palette- colouring, has been suggested as a 
basic example of constraint satisfaction problem arising in the context of distributed data 
storage [2]. The basic idea is as follows. On a computer network with limited storage 
resources at each node, it may be convenient to divide a file into a number of segments 
(colours), which are then distributed over different nodes. Each given node should be 
able to retrieve the different segments by accessing only its own and nearest neighbour 
storage devices, whence the above described constraints. Even in this case, one might 
be satisfied with a suboptimal solution, i.e., maximising the number of colours present 
in each node neighbourhood. We note that palette-colouring has not been proved to be 
NP-complete, but there are numerical evidences that it becomes intractable for large 
system size [2]. With respect to ordinary colouring, the most relevant difference is that 
the modified problem becomes easier to solve for graphs with higher, rather than lower, 
vertex degrees. 

In the last few years, different types of constraint satisfaction problems have been 
faced by message passing techniques, among which Belief Propagation (BP) [ll[5]. BP 
has been originally conceived as a dynamic programming algorithm to perform exact 
statistical inference for Markov random field models defined on graphs without loops 
(trees) [H [7]. Subsequently, it has been demonstrated to be relatively good even for 
loopy graphs. Such a successful behaviour seems to be related to the fact that actually 
BP is equivalent to determine a minimum of an approximate free energy function (Bethe 
free energy) for a corresponding thermodynamic system. The Bethe approximation was 
indeed very well known to physicists [H [9] , but the connection with BP is a relatively 
recent result |4j. 

In [2], Bounkong, van Mourik, and Saad analyse an algorithm based on BP, 
comparing its performance with a variant of Walksat |10]. In particular the BP- 
based algorithm makes use of beliefs averaged over several iterations, together with 
a common decimation strategy. It is observed that, while Walksat works definitely 
better for small graphs (100 vertices), the opposite occurs for larger (1000 vertices) 
random graphs. This result is somehow related to the nature of BP itself, since large 



Palette- colouring: a belief-propagation approach 



3 



random graphs are known to be tree-like, in the sense that the probabihty of finite 
length loops tends to zero as the number of vertices becomes large. Nevertheless, the 
BP algorithm employed in [2] follows a naive scheme, in which every message provides 
a contribution to the probability distribution of a single variable, taking into account 
a given interaction (constraint). Due to the many- variable nature of the constraints 
present in the palette-colouring problem, this scheme no longer provides the exact 
solution even in the case of trees. As suggested in the cited work, the exact solution can 
be determined at the cost of propagating generalised messages providing a contribution 
to the joint probability distributions of pairs of nearest neighbour variables (instead 
of single variable distributions). This scheme has already been used in different works 
dealing with structural and spin glass models (see for instance [11] and [12]), in the 
presence of similar "all-neighbours" interactions. In the current paper, we work out 
the pairwise BP scheme for the palette-colouring problem. As readers more familiar 
with these methods may have noticed, this scheme is not a generalised BP [13]. Indeed, 
in the literature, the latter term usually denotes a class of algorithms computing the 
minima of more refined free energy approximations (Kikuchi [T3], rather than Bethe, 
free energies) [15]. Here, however, we derive an algorithm computing the correct Bethe 
approximation, which is, the exact solution for loopless graphs. We then compare the 
performance of the new BP algorithm (which we shall simply call BP from now on) to 
the naive one, showing that further improvements can be obtained. 

Let us note that the correct Bethe approximation has already been considered for 
this problem by Wong and Saad [3], in order to investigate the emergence and nature 
of the satisfiable to unsatisfiable transition, observed upon decreasing the mean vertex 
degree of different sparse random graphs. In the replica-symmetry assumption, the 
authors of the cited paper study average macroscopic properties of a given random 
graph ensemble, making use of a numerical method of the population dynamics type. 
In the current work, we mainly focus on the algorithmic properties of the message 
passing procedure, and related decimation strategies. In particular, we discuss both 
analytical and numerical strategies for limiting the increase of computational cost 
arising from the pairwise messages. Also, in the last part of the paper, we develop the 
distributional version of the message-passing scheme, which in the literature is usually 
denoted as cavity method ^6] and used to study random (glass-like) systems [T7] . 
We limit this analysis to the replica-symmetry assumption and to the simple colour- 
symmetric (paramagnetic) solution. Within these simplifying hypotheses, we compute 
the quenched entropy for given random graph ensembles, and estimate the corresponding 
satisfiability threshold, partially recovering a result of [3]. 

2. Statement of the problem and Belief Propagation 

We consider an undirected simple graph, whose vertices are denoted by z = 1, . . . , A^. 
Our goal is to assign to each vertex i a colour Xi from a given colour set C = {1, 2, . . . , g}. 
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in a way to minimise the cost (energy) function 

N 

E{xi,...,XN) = ^'n{xi,Xdi), (1) 

i=l 

where di denotes the neighbourhood of i (i.e., the set of vertices directly connected to 
i by an edge), and xgi = {xj}jQdi the array of colour variables in di. The elementary 
energy term T]{xi,Xgi) counts the number of missing colours in the neighbourhood of i, 
including i itself. A suitable expression for the r] function is therefore 

n 

r]{xu. . .,Xn) = ^Yl[l - S{xi,x)], (2) 

x£C i=l 

where 6{x, y) is a Kronecker delta, and n is the number of entries of the r] function (not 
a-priori fixed). With the above definitions, the cost function value is E{xi, . . . , xn) = 
if and only if the colour assignments Xi, . . . , xat satisfy all constraints. 

In the current work, we deal with this problem by studying an equivalent 
"thermodynamic" system, whose potential energy is defined by the cost function 
E{xi, . . . ,X]sf). For energy minimisation, we consider the zero temperature limit. 
The BP approach allows us to determine approximate marginals of the equilibrium 
(Boltzmann) probability distribution for the colour variables. As mentioned in the 
Introduction, our approximation becomes exact when the graph is a tree. From the 



treatment described in [Appendix A[ it turns out that we can write two different 
marginals, namely, the joint distribution of two colour variables on a graph edge 
Pij{xi,Xj), and the joint distribution of a given colour variable together with its 
neighbours Pi,di{xi,xsi) ("cluster" distribution), as a function of pairwise messages 
mj^i{xj,Xi). Each given term mj_^i{xj,Xi) may be viewed as a message sent from the 
cluster {j,dj} to the edge {i,j}, representing the influence of the constraint associated 
to the vertex j onto the colour variables of the edge {i,j} (some details about this 



interpretation are elucidated in Appendix B ). In formulae, we have 

PiM^u Xdi) = e^'-^''('^-'^S') Yl ^j^ii^j^ a;^), (4) 

j&di 

where (3 is the inverse temperature, and fij and /j, usually called free energy shifts (see 



Appendix A), can be determined by normalisation as 



Q ^ ?77.j_^j (Xj , ) ?7ij__^j(Xj , Xj) , (5) 

g-/. = ^ e-''''(^-^'«) JJmj-^i(xj-,Xi). (6) 

Xi,XQi jedi 

The messages have to satisfy a set of self-consistency equations, which basically account 
for compatibility between "overlapping" distributions. For instance, the {i,j} edge 
distribution must be a marginal of the cluster distributions associated to both vertices 
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i and j. Considering the former case, we can write 

where the sum runs over the values of the array of colour variables XQi\j = {xk}kedi\jy 
i.e., the colour variables in the neighbourhood of i except xj. In fact, we can obtain the 
self-consistency equation by replacing ([3]) and (jl]) into the compatibility equation ([7]), 
yielding 

mi^j{xi,Xj) cc^e-^'^^''''''^'^ JJ mk^i{xk,Xi), (8) 

^di\j k€di\j 

where a normalisation factor has been replaced by the proportionality symbol. In order 
to satisfy all the necessary compatibilities, one equation of the above form must hold for 
each directed edge i ^ j. The BP algorithm solves the set of self-consistency equations 
iteratively, starting from suitable (usually random or uniform) initial conditions for 
the messages, until the distance between messages at subsequent updates goes below a 
given threshold. From a heuristic point of view, each message update according to ([8]) 
is usually interpreted as a propagation process, so that in the following we shall also 



denote ([8]) as the propagation equation. For completeness, in Appendix B we also report 
the propagation equations of the naive BP algorithm, which are numerically simpler. 

We note that, by employing the explicit expression ([2]) of the elementary energy 
term (cluster energy), we can significantly reduce the computational cost of the 
propagation equation ([8]) as well. Indeed, it turns out that the latter can be rewritten 
as 

mi^j{xi,Xj) (X ^ (-l + e^^)!-^! Y\_ ^ mk^i{xk,Xi), (9) 

BCC\xi\xj k&di\j Xk<^C\B 

where the outer sum runs over all the possible subsets B of the colour set C without 
the colours Xi,Xj. The derivation can be found in Appendix C Now, we compare 



the computational cost of the generic equations with respect to the simplified form. 
Assuming that d is the degree of vertex i, the generic equation ([8]) requires {d — l)q'^~^ 
multiplications, which can be reduced to 2g°'~^-|-^^~2 5'°'"" by suitable (straightforward) 
programming tricks. Taking into account that a trivial necessary condition for an 
elementary constraint to be satisfiable is d > q—1, the leading term of the computational 
cost turns out to be at least g''"^. The simphfied equation ([9]), however, requires 
[d — 1)2*^"^ multiplications, which is clearly much more convenient for any q > 2. 

Finally (for completeness and future use), we also report the simplified expression 
of the cluster free energy shift ([6]) 

e-^' = E E (-l + e"')""n E ^M^J^^^)^ (10) 

XiGC BCC\xi jedixjeC\B 

which can be obtained by an analogous derivation. 
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3. Optimisation strategy and numerical results 

In this section, we define the optimisation strategy, and test its performance on single 
instances of random graphs drawn from a suitable ensemble. Our strategy involves a 
decimation procedure, which is analogous to that of [2J, but is carried out on the basis of 
nearest-neighbour pair distributions pij{xi, xj), rather than single- variable distributions. 
Given a graph and a number q of available colours, we first fix the colour of a randomly 
chosen vertex, in order to break the colour permutation symmetry, and proceed as 
follows. We perform the first BP run (starting from uniform messages) and determine 
the pair distributions according to ([3]). For each edge {i,j}, we fix the colour variables 
Xi, Xj at the values Xj, Xj having the largest joint probability, provided the latter is larger 
than a certain threshold. If no probability satisfies such a condition, we only fix the 
pair of variables with the largest joint probability over the whole graph. Then, we rerun 
BP (starting from the previously computed messages) and iterate the above procedure 
until all variables are fixed, or all constraints are satisfied (in the latter case, non-fixed 
variables can be assigned a random colour). We always set the threshold probability 
at 0.9, as done in [2]. We observe that, in most cases, one of the two variables chosen 
to be fixed has been already fixed at a previous stage of the decimation procedure, so 
that, in most cases, we actually fix just one variable for each given pair. Therefore, even 
though we are working with pair, rather than single-variable distributions, we observe 
that choosing the same threshold probability results in a similar decimation rate. 

We now spend a few words on the precise meaning of "fixing a variable", as 
introduced above, from the point of view of the message-passing procedure. In the 
thermodynamic language, colouring a vertex is tantamount to imposing an infinite 
energy penalty to all other possible colours. Thus, if we want to fix a single variable Xi 
to a given colour Xj, we may add to the corresponding cluster energy ri{xi,XQi) a term 
7[1 — 6{xi,Xi)], and then take the limit 7 -> oo. By the propagation equation ([H]), it is 
easy to see that such operations imply that all the messages mi^j{xi, Xj), sent from the 
vertex i (more precisely, from the cluster associated to the vertex i), must be multiplied 
by a pref actor 6{xi,Xi), which basically preserves only messages of the type mi^j{xi, Xj). 
As a consequence, when we fix the colours of two nearby vertices, it turns out that the 
latter no longer need to exchange messages or, in other words, the messages remain 
fixed at 



Although such messages have no effect on the vertices i and j themselves, due to the 
form of the propagation equation, they may still influence their neighbourhoods di \ j 
and dj \i. 

Before presenting the results, we note that in [2] the authors observe that 
the naive BP hardly ever converges. This problem is circumvented by computing 
probability distributions as "time-averages" over a number of iterations, which turns 
out to provide sufficient information for guiding the decimation procedure. In our 
scheme, the BP algorithm turns out to converge more frequently, except in the vicinity 
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of the satisfiabihty threshold (especially after several vertices have been coloured). 
Convergence may be improved by computing the message updates as convex linear 
combinations between the old estimates (with coefficient a) and the updates obtained 
from the propagation equation (with coefficient 1—a). The adjustable parameter a plays 
the role of a damping in the propagation dynamics, and we refer to it as the damping 
parameter. Nevertheless, we generally find that reaching convergence is not really 
necessary. Indeed, a very small number u of sequential update^ of all messages are 
sufficient to provide the relevant information about pair probabilities, and that a larger 
number of iterations does not significantly improve the overall algorithm performance. 
This fact allows us to drastically reduce the computational cost of the full procedure, 
although it does not affect the complexity of a single iteration. 

We are now in a position to perform a quantitative comparison with the naive BP 
approach [2] . As in the cited work, we consider a number of available colours g = 4 and 
random graphs with N = 1000 vertices. Graphs are generated in such a way to have 
vertices with two different degrees d = [cj and d = [c] , where c is the mean degree. 
The degree distribution, i.e., the probability of a vertex having degree d, is therefore 



which we denote as linear distribution. We always assume c > g — 1, in order to avoid 
the appearance of vertices with degree less than q — 1, for which the local constraints are 
necessarily unsatisfiable. We do not report results about graphs with cut-Poissonian 
degree distribution |2], which exhibit analogous behaviour. 

In figured] we report both perfect colouring and unsatisf action measures, over 1000 
random graph samples, as a function of the mean degree. The perfect colouring measure 
is simply defined as the fraction of samples for which the algorithm has been able to 
find a colour assignment satisfying all constraints. The unsatisfaction measure counts 
the fraction of missing colours per vertex, i.e. the energy per vertex divided by the 
total number of colours, E{xi, . . . ,XN)/Nq {xi,...,xn being the colour assignments 
found by the algorithm), averaged over all samples. We can see that the BP approach 
improves the naive one in both respects. The perfect colouring measure turns out to be 
consistently increased in the vicinity of the critical mean degree values, below which it 
rapidly vanishes. In this region, naive BP itself was already found to work better than 
the Walksat-like algorithm, analysed in [2]. 

In analogy with the ordinary colouring problem [18] (though with reversed role 
for the mean degree c), we expect that, for even lower c values, our problem 
becomes unsatisfiable with high probability (i.e., with probability tending to 1 in the 
"thermodynamic" N ^ oo limit). We also expect the presence of an intermediate 

^ With reference to the propagation equation ([5]), by sequential update we mean that, in generating a 
given "output" (left-hand side) message, one makes use of updated "input" (right-hand side) messages, 
if aheady available. 




otherwise 



iid= [cJ 
ifd= \c] 



(12) 
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Figure 1. Perfect eolouring (left) and unsatisfaction (right) measures over 1000 
graphs for naive BP [2 (open squares) and BP with v — i and a = 0.1 (solid squares), 
as a function of the mean degree. In both cases the inverse temperature used for the 
computation is /? = 10. 



hard-satisfiable phase in which the problem is satisfiable with high probabihty but 
BP fails, because of a clustered structure of the solution space (replica-symmetry 
breaking) [161 113 [HI [El 120] • Accordingly, the perfect colouring probability falling 
down to zero is likely to indicate the onset of such hard-satisfiable phase rather than 
the truly unsatisfiable phase. We shall return to this point later. For the moment, we 
observe that the BP approach definitely works better than the naive one, even for very 
low c values, in the (expected) unsatisfiable phase. In this region we observe both a 
reduction of the unsatisfaction measure itself and of its growth rate with decreasing c. 

Concerning the percentage of perfect colouring, we have noticed that the 
performance of the algorithm is significantly affected by the number v of iterations 
per decimation step, only in a narrow region close to the critical c value. This suggests 
that in this region the problem is actually more difficult to solve. Some results about 
the influence of the v parameter are reported in figure [21 Upon increasing i/, some 
improvement can also be observed in the unsatisfaction measure. However, as previously 
mentioned, increasing v values beyond 2 or 3 does not yield any further significant 
improvement. We also note that a quantitatively comparable improvement of the 
unsatisfaction measure is obtained by choosing a small but nonzero value of the damping 
parameter a. All the results reported in the current paper have been obtained with 
a = 0.1, but it turns out that in a rather large range (0.03 < « < 0.3) the average 
algorithm performance is practically independent of the precise value of the damping 
parameter. Finally, we note that (for u > 2) the perfect colouring measure exhibits a 
slight kink at c = 4.0. This can be ascribed to an abrupt change in the structure of the 
graph ensemble. In fact, according to the linear degree distribution f[T2l) . for c = 4 all 
vertices have exactly degree 4, whereas, for c>4orc<4a few vertices appear with 
degree 5 or 3, respectively. 
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Figure 2. Perfect eolouring (left) and unsatisfaction (right) measures over 1000 
graphs for BP with a = 0.1 and /3 — 10, as a function of the mean degree. Squares, 
circles, triangles denote v — 1,2,3, respectively. In the main figures, interpolation 
between data-points in the transition region has been performed by taking into account 
the extra data-points reported in the insets. 
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Figure 3. Perfect colouring (left) and unsatisfaction (right) measures over 1000 
graphs for BP with i' — 2, a = 0.1, and (3 — 10, as a function of the mean degree. 
Squares, circles, triangles denote number of vertices N = 1000, 2000, 4000, respectively. 



We have also analysed the algorithm behaviour as a function of the number of 
vertices N. The results are reported in figure |3l We can see that the transition in the 
perfect colouring probability becomes more and more abrupt upon increasing A^, and 
a cross-over point appears at a mean degree value c ~ 3.943. Even though the precise 
cross-over value may depend on the algorithm parameters, such behaviour suggests that 
the transition may be sharp (first-order- like) in the N oo limit. The latter conjecture 
is consistent with the fact that random graphs of increasing size become more and more 
tree-like, such that the BP approach is able to provide better and better approximations. 
In principle, the cross-over point might be the signature of the satisfiable to unsatisfiable 
transition, but, as previously mentioned, we are rather led to identify it with the onset 
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of the hard-satisfiable phase. Indeed, the estimate of the satisfiabihty threshold, carried 
out in the next section, provides further evidence in favour of the latter hypothesis. 

4. Entropy and satisfiability threshold 

In this section, we study average macroscopic properties of the BP solution over 
random graph ensembles, with particular attention to the average entropy. The latter is 
usually denoted as quenched entropy in statistical mechanics language. Taking the limit 
/3 — )■ oo, this quantity provides an average measure of (the logarithm of) the number of 
zero energy configurations, i.e., perfect colourings, for a given ensemble, which also 
allows us to estimate the satisfiability threshold. In this context, the main source 
of approximation will be the replica-symmetry assumption, since the approximation 
due to BP itself is expected to be negligible in the infinite size limit. Furthermore, 
we limit the analysis to BP solutions that do not break the colour permutation 
symmetry ("paramagnetic" solutions), because we have numerical evidence that, when 
BP converges, no spontaneous symmetry breaking of the solution is ever observed. 
Average properties of non-paramagnetic (glass-like) solutions have been investigated 
in [3], but they do only appear at very low c values, where the replica-symmetry 
assumption is expected to break down anyway. 

According to the paramagnetic ansatz, the messages are always such that mi^j{x, x) 
does not depend on x, and mi^j{x, y) does not depend on x, y,if x ^ y. This means that 
the only important quantity is Ui^j = mi^j{x,x)/mi^j{x,y), i.e., the ratio between the 
"equal colours" message and the "different colours" message. Taking into account that 
the message normalisation is irrelevant to all observable quantities, we can write the full 
message as 



We note that in principle one could also think about the inverse ratio 
mi^j{x,y)/mi^j{x,x) as the relevant message, but this choice turns out to be un- 
feasible, due to the nature of the constraints, favouring the presence of different neigh- 
bouring colours. Indeed, at zero temperature, it is easy to foresee the emergence of 
"hard" messages such that mi^j{x,x) = 0, stemming from vertices with degree q — I 
(whose all neighbours are forced to have a different colour), whereas we always expect 
mi^j{x, y) ^ for X ^ y in a paramagnetic state. 

Replacing f|T3|) into the inner sum appearing in the simplified propagation 
equation ([9]), we can write 



Xk<^C\B 

where the term — 1 + u^^i appears because Xi ^ B. Since the sum above only depends 
on B via its cardinality \B\, in ([9]) we can replace the sum over 5 by a sum over 




if X = y 
otherwise 



(13) 




(14) 
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cardinalities, inserting suitable binomial coefficients. Thus we finally obtain a reduced 
propagation equation for the message ratios: 



q-1 



9-1 

\ n , 

n=0 ^ ^ kedi\j 



kedi\j , , 

u.^, = _ , (15) 

in which we have also taken the zero temperature (/3 — )■ oo) limit. The cluster free energy 
shift can be similarly derived by replacing (fT3|) into ( ITO|) . In the zero temperature limit, 
we obtain 

e-^" = ~ ^) (-1)" 11(9 - ^ - 1 + «,^.)- (16) 

n=0 ^ ^ jrGSi 

The edge free energy shifts can be directly obtained by inserting (IT^ into 

Q-f-J = g (g _ 1 + Ui^j Uj^i). (17) 

We can characterise a random graph ensemble by a probability distribution of 
messages P{u). Such a distribution has to obey a functional equation (usually known 
as cavity equation [16]) of the following form 

^(^) = ^Pd 1 dMiP(ni) . . . / <\ud^iP{ud-i)5{u - u(mi, . . . , Mrf_i)), (18) 

where u{ui, . . . , Ud-i) is the "propagation function" defined by (fT^ . and where is the 
probability of finding a vertex of degree d by choosing a random direction in a randomly 
selected edge. It is easy to see that pd is related to the degree distribution as 

Pd= • 19 

c 

In the context of the cavity method, the replica-symmetry assumption consists in the 
fact that we consider a single distribution of messages. In a replica-symmetry breaking 
scenario, each propagated quantity Ui^j (message) would be replaced by a probability 
distribution defined over different ergodic components (states) |16] . 

We solve the functional equation f|T8|) numerically by a population dynamics 
approach [16]. In a nutshell, we represent the distribution P{u) by an evolving 
population of messages. An elementary evolution step consists in generating a new 
message according to the propagation equation ( !T5|l . making use oi d — 1 messages 
randomly taken from the population, where d is randomly generated according to the 
Pd distribution. The newly generated message replaces a randomly selected message of 
the population. Due to the presence of hard messages u = generated by degree q — I 
vertices, we observe that the message distribution P{u) contains a Dirac delta peak 
centred in zero with weight Pq^i- 
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From the message distribution, we can evaluate the average cluster and edge free energy 
shifts as: 

fc = ^Pd / duiP{ui) ... j dudP{ud)fc{ui, . . .,Ud), (20) 

J= jdUiPiUi) jdU2P{u2)feiUi,U2), (21) 

where the functions /c(ui, . . . , Ud) and /c(mi, U2) are defined by (fT6l) and (|T71) . Thus we 
obtain the average free energy per vertex as 

7=ic-^^ (22) 

where c/2 is the average number of edges per vertex. The above formula directly 
descends from ( ]A.12[) . Finally, since we have incorporated a (3 factor in our free energy 
definition, and since the limit /3 — t- 00 fixes the energy at zero, the entropy per vertex is 
simply s = —f. 

For actual calculations, we have considered random graph ensembles with the linear 
degree distribution (as defined in the previous section), and with the cut-Poissonian 
distribution (also considered in [2]), defined as 

r (c-g + l)^ iid>Q-l 

Pd=l {d-q + l)\ (23) 

otherwise , 

where c is still the mean degree. This distribution also excludes vertices with degree 
smaller than q — 1 and, hence, trivially unsatisfiable constraints. 

In figures H] and [5] we report the results for the two ensembles respectively. As 
expected, the entropy turns out to be a monotonically increasing function of the mean 
degree, as the problem becomes easier to satisfy. It is interesting to note that the fraction 
of hard messages Pg_i for the linear degree distribution turns out to be nonzero only 
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Figure 5. The same as figure 2] for random graphs with cut-Poissonian degree 
distribution. 

for c < q, which explains the kink observed in the entropy function. Negative entropy 
identifies the unsatisfiable region (perfect colourings are exponentially rare), whereas 
the zero entropy point identifies the satisfiability threshold Cth- For the two ensembles, 
we respectively find Cth ~ 3.825 and Cth ~ 4.082. As previously mentioned, we expect 
that these values are in fact approximate ones, because we have neglected the possibility 
of replica-symmetry breaking. Nevertheless, these values are in reasonable agreement 
with the numerical estimates put forward in [2], namely Cth ~ 3.8 and Cth ~ 4.1. As 
far as the linear ensemble is concerned, we expect that our result is also analytically 
equivalent to the (replica-symmetric) one by Wong and Saad [3J, and in fact we obtain 
a very good numerical agreement for the threshold value. 

5. Summary and conclusions 

In this paper, we have considered a variation of the well-known graph colouring problem, 
which may be viewed as the prototype of a combinatorial optimisation problem emerging 
in the context of distributed data storage. We have worked out the BP equations 
for this problem, which provide the exact solution on a tree. Due to the many- 
body nature of the problem, such equations turn out to be different from the naive 
BP message-passing scheme, as the latter involves messages sent to single variables, 
whereas the former involve messages sent to pairs of nearest neighbour variables. Our 
simulations, performed on random graphs drawn from a suitable ensemble, suggest 
that the new algorithm, associated with a decimation procedure, turns out to be much 
more effective than the naive BP-based algorithm. In particular, the probability of 
finding a perfect colouring is significantly enhanced, especially in the vicinity of the 
satisfiable-to-unsatisfiable transition. Furthermore, both the unsatisfaction measure 
and its growth rate upon decreasing the average graph connectivity are significantly 
reduced. This improved performance is, however, obtained at the cost of increased 
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computational complexity. Therefore, we have suggested two possible ways of reducing 
this complexity. On the one hand, we have shown some analytical manipulations 
(exploiting the particular form of the constraints) can simplify a single iteration. On 
the other hand, numerical experiments have shown that very few iterations (even a 
single one) provide sufficient information to drive the decimation procedure. We note 
that, in this way, our decimation procedure turns out to be somehow "distributed" 
over different BP iterations. This fact partially reminds us of the so-called "reinforced 
BP" approach, which has been successfully exploited to solve different combinatorial 
optimisation problems [21]. Although beyond the scope of the current paper, it might 
be interesting to analyse the performance of the latter method for the palette-colouring 
problem. Indeed, the reinforcement strategy would replace the decimation procedure, 
allowing for a fully decentralised implementation of the algorithm, which might make it 
actually appealing from a practical/technical point of view. 

From a more theoretical perspective, we have applied the cavity method to investigate 
the satisfiable-to-unsatisfiable transition, which appears upon decreasing the average 
graph connectivity. Limiting this analysis to the replica-symmetry assumption, we have 
observed that the threshold connectivity seems to be significantly displaced with respect 
to the value observed in the numerical experiments. As previously mentioned, this fact 
suggests that the breakdown of the algorithm may occur because of the onset of a hard- 
satisfiable phase. It would also be interesting to investigate this possibility, making use 
of the cavity method at the level of 1-step replica symmetry breaking [1^, along the 
lines of several works dealing with the ordinary colouring problem fi8\ [T9l [20] . 

Appendix A. Belief Propagation equations 

The BP equations can in general be derived from a very simple recipe. One first "fakes" 
that the graph is a tree and then formally applies the equations obtained for such a case 
to a generic graph. This derivation also provides a heuristic argument explaining why 
the method generally works better for graphs with a tree-like structure. 

According to the Boltzmann law, the joint probability distribution of all the colour 
variables can be written as 



where E{xi, . . . ,xn) is the energy function ([T|), (3 is the inverse temperature, and F is 
the free energy (times P), which can be determined by normalisation. Following our 
"fake assumption", we can consider, for each edge ^ — )■ j (defined with a direction), the 
branch growing from the root vertex j towards i, disconnected from the remainder of 
the system (see figure IXTj) . We can thus define a partial energy function Ei^j{xi^j), 
obtained by summing the elementary interaction energies only for vertices in the branch, 
except the root vertex. Since our elementary interaction energies couple together clusters 
of variables including each vertex and all its neighbours, each partial energy function 
depends on the array of all colour variables in the branch including the root vertex. We 




(A.l) 
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Figure Al. Tree graph (left), disconnected branclr i — > j (centre), and decomposition 
of the latter into subbranches fc — i, for k G di \ j, plus the elementary cluster 
associated to i (right). 



denote this array by Xi^j. Now, each disconnected branch can be ideally studied as an 
independent subsystem, whose Boltzmann probability distribution turns out to be 



(A.2) 



where Fi^j denotes the corresponding free energy. Note that it is possible to decompose 
the partial energy of the given branch i — > j into a sum of the partial energies of its 
subbranches k ^ i, for all k & di \j, plus the elementary interaction energy associated 
to i (see figure lAll) : 

Ei^j{xi^j) = r]{xi,xai) + ^ Ek^i{xk-^i). (A.3) 

kGdi\j 

We also define a free energy shift fi^j as the difference between the free energy of 
the i ^ j disconnected branch and the sum of free energies of its (disconnected) 
subbranches, i.e., 

k(idi\j 

From (IA.2p . flA.3p . and f!A.4p . we can write 



-I3r}{xi,xgi) 



(A.5) 



k&di\j 

which provides a relationship between the Boltzmann distribution of the i ^ j 
disconnected branch and those of its (disconnected) subbranches. Defining the messages 
mi^j{xi,Xj) as marginals of a corresponding branch distribution pi^j{xi^j) over the 
variables Xj and Xi (respectively, the root vertex and its first neighbour in the branch) 
we finally obtain the self-consistency equation ([8]). 

We still have to show how messages can determine cluster and edge marginals of the 
full Boltzmann distribution (lA.ip . As in our previous manipulations, we observe that, 
for each given vertex i, it is possible to write the total energy function ([T]) as a sum of 
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partial energies of the disconnected branches j ^ i, for all j G di, plus the elementary 
interaction energy associated to i: 

E{xi, . . . , xn) = ri{xi, Xdi) + ^ Ej^i{xj^i). (A. 6) 

Defining also the free energy shift /, as the difference between the total free energy 
F and the sum of the disconnected branch free energies, for all the possible branches 
growing from vertex i, i.e., 

F = /, + ^F,^„ (A.7) 

jedi 

from dAH), ([Oj) . fEOj) . and (jAj]), we easily obtain 

Now, the cluster distribution Pi^di{xi, xgi) for each vertex i can be derived as a suitable 
marginal of p{xi, . . . ,xn)- By this marginalisation, we obtain (jl]). As far as edge 
marginals are concerned, we have to consider a different decomposition of the total 
energy function. Namely, for each edge {i,j}, the former can be written as a sum of 
two contributions from respectively the branch starting from j towards i and the one 
starting from i towards j: 

E{xi, Xn) = Ei^j{xi^j) + Ej^i{xj^i). (A. 9) 

We define the free energy shift fij as the difference between the total free energy F and 
the sum of the free energies of the disconnected branches mentioned above, i.e., 

F = fij + Fi^, + F,^i. (A. 10) 

From (lA^), flCT) . and ^KJO\f . we obtain 

p{xi, ...,xn)= e^''Pi-,j{xi^j)pj^i{xj^i). (A.U) 

Evaluating the edge distribution pij{xi,Xj) as a marginal of p{xi, . . . ,xi\f), we obtain 
©. 

Finally, we determine the total free energy as a function of the free energy shifts. 
First we sum both sides of ( 1A.7I) over all vertices i, and both sides of ( lA.lOl) over all 
edges {i,j}- Then we subtract the latter equation from the former. It is easy to see 
that, on a tree, the number of vertices equals the number of edges plus one, such that 
the left-hand side of the resulting equation turns out to be exactly F. Furthermore, in 
the right-hand side all the branch free energies cancel out, and we obtain 

N 

where ^{jj} denotes the sum over all edges. 
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^4 




Figure Bl. A simple undirected graph (left), and the related factor graphs giving 
rise to naive BP (centre) and BP (right). Open circles and squares denote variable 
and function nodes, respectively. The labels are explained in the text. 



Appendix B. Factor graph formalism 

In this appendix, we first introduce a more general form of BP equations, defined on 
a factor graph [22]. Then, we show that from this form one can derive both the naive 
BP equations of [2] and the BP equations of the current paper by two different factor 
graphs associated to the same problem. 

A factor graph is a bipartite graph, whose left- and right-side vertices are usually 
referred to as variable nodes and function nodes. The notion of factor graph is meant 
to describe the structure of the energy function, whose independent variables (i.e., the 
configuration variables of the corresponding thermodynamic system) are associated to 
the variable nodes. A function node connected to a number of variable nodes represents 
an elementary interaction among the corresponding variables. Let V denote the set of 
all the variable nodes, such that each node v & V is associated with a configuration 
variable Let also A C V denote any subset (cluster) of variable nodes, and let 
xa = {xv}v^A denote the array of the associated configuration variables. We can thus 
write the energy function as 

Eixv) = 5^eA(xA), (B.l) 

where caIxa) denotes the elementary interaction energy among the variables in the 
cluster A (cluster energy), whereas the sum runs over the set J-" of all the interacting 
clusters. In what follows, the same label A denotes both a function node and the cluster 
of variable nodes connected to it. An example of factor graphs describing the energy 
function of a palette-colouring problem is sketched in figure | 



When the factor graph is a tree, an argument similar to that in [Appendix A| allows 
one to write marginals of the Boltzmann distribution as follows: 
- For each variable node v & V we have the marginal: 



Pv{xy) = e^" Y\ niA^vixy), (B.2) 

A3v 

where the product runs over all the clusters A to which v belongs (i.e. all the function 
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nodes connected to v), mA^y{xy) is a function-to-variable message, and f^ is a free 

energy shift (ensuring normalisation). 

- For each cluster A & J-", we have the marginal: 

Pa{xa) = e^^-^^^(^^) H w,^a{x,), (B.3) 

where f^ is a free energy shift, and where Wv^a{xv) is a variable-to-function message: 
w^_,Aixy) = Yl fnA'-:,vix,,), (B.4) 

A'eT\A 
A'Bv 

a product of the messages sent to v from all connected function nodes except A. 
As shown in Section [2], one can derive the propagation equations by imposing 
compatibility between overlapping distributions. In this case, for all A G J-" and for 
all V & A, we can write 

Pvixy) = ^Pa{xa), (B.5) 

where the sum runs over all possible values of the variables in the cluster A except Xy. 
Inserting (lB.2p . (IB. 31) into (IB.Sp . we obtain the propagation equation 

m^^,(x,)oc5^e-^^^(^-^) H w,,^Aix,>), (B.6) 
xa\v v'eA\v 

with the Wvi^a{xvi) defined by ( lB.4p . Note that, as in ([H]), we have replaced the 
normalisation factor with a proportionality symbol. Finally, following the argument 
of Appendix A , we write the total free energy as a function of the free energy shifts as 

AeT vev 

where is the degree of the variable node v in the factor graph. 



Naive BP 

We first consider the energy function ([1]), where the configuration (colour) variables Xi 
are associated with the vertices i = 1, . . . , iV of an ordinary graph, and the elementary 
interaction energy involves a cluster made up of a vertex i and all its neighbours di. 
This structure is described by a factor graph in which the variable nodes are associated 
with the vertices of the original graph and the function nodes with the clusters. We can 
use the same index for both the variable node i and the function node with i at its centre 
(the cluster Ai = {i, di}). Hence, each variable node i receives messages mA^^i{xi) from 
all the function nodes Aj with j G di, and from Ai itself. With the short-hand mj^i for 
rriAj^i, omitting the normalisation factor, (IB. 21) becomes 

Pi{xi) oc mi^i{xi) J]^ mj^i{xi). (B.8) 
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Similarly, a function node Ai receives variable-to-function messages from i and all j G di, 
and the cluster distribution for fIB.SP becomes 

PiM^i^ ^di) 0^ e-'^''('=-'=e»)w,^i(x,) Yl yjj^iixj). (B.9) 

We have identified eAii^A,) with r]{xi,Xdi), and is short-hand for wj^a,- From 

(IB.4p . one can see that the variable-to-function messages take two slightly different 
forms, depending on whether they travel (to the cluster A^) either from the "central" 
node i or from a "peripheral" node j G di. In the simplified notation, we have 
respectively 

ti'i^j(xj) = JJ mj^i(xi), (B.IO) 
Wj^i{xj) = mj^j{xj) Yl rnk-,jixj). (B.U) 

k£dj\i 

The compatibility condition (IB.Sp . can also be written in two different forms. For all 
i = 1, . . . , N , j E di, we have respectively: 

Pi{xi) = ^Pi,diixi,xai), (B.12) 
Pi{xi)= ^ Pj,dj{xj,xaj). (B.13) 
Using ( IB.Sp and (IB.Qp . this in turn gives rise to two different propagation equations: 

XQi jedi 

ruj^iixi) oc e-^'^^''^''''''^Wj^j{xj) JJ Wk-,j{xk). (B.15) 

These equations, together with (IB.lOp and (IB. lip , are identical (apart from the notation) 
to the naive BP equations presented in p] . From figure IBll one sees that even when the 
original graph is a tree, the corresponding factor graph contains short loops, and the 
naive BP equations are not exact. 



Current BP 

We now consider an alternative form of the energy function ([T]) by introducing: 

(i) a variable for each vertex-neighbour pair {i,j G di) (a kind of "replica" of Xj); 

(ii) a constraint imposing that all replicas of Xi are equal for each vertex i. 

The constraints can be realised by assigning infinite energy penalties to configurations 
we want to be forbidden. Assuming 7 — )■ oo, we define 

N 

Ei{x{}jedi, {x^N}jedN) = H^i^ {^ibeSi) + xi{xi}jedi)] , (B.16) 

1=1 

where the function x(-) returns 1 when its entries are not all equal, and otherwise, 
whereas x* means that the replica index is irrelevant. Note that the allowed (finite 
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energy) configurations can be directly mapped onto the configurations of the original 
system, and also have the same Boltzmann weights. This does not depend on the specific 
form of the cluster energy function ?7(-), but only on the fact that each vertex of the 
original graph interacts (at most) with all its neighbours. With these definitions, each 
edge {i,j} of the original graph can be naturally associated with the pair of variables 
{xl,x'j} (the j-replica of Xi and the i- replica of Xj). Moreover, the structure of the 
modified energy function flB.161) is described by a factor graph in which the variable 
nodes v correspond to the edges {i,j} of the original graph, while the function nodes 
A now correspond to the clusters of interacting edges Ai = {{i,j}\j G di}. Figure [BT] 
shows that, when the original graph is a tree, this factor graph is also one, and every 
variable node {i,j} has degree 2, so that it only receives messages from the function 
nodes Ai and Aj. Using nii^j as short-hand for mAi^{i,j}, ( 1B.2I) becomes 

P{ij}{xl,xi) = e^('-^> mi^j{xl,xi) mj_,i(a;}, x^). (B.17) 
The variable-to-function messages (]B.4p are simply 

Wi^j{xl,x]) = mi^j{xl,x'j), (B.18) 
where Wi^j is short-hand for w^ij^^A^- Finally, the cluster distribution (IB. 31) is 

PA,({a:i,x}},ea.) = e^^--^''(^*'HbeaO-/37x(KbeaO "Q ^^^^(xj,^!), (B.19) 

j&di 

where the cluster energy eA{xA) has been replaced with the elementary term of ( IB. 161) . 
Discarding forbidden configurations (dropping replica indices), (]B.17p is equivalent 
to ([3]), and, since all the x-terms vanish, (IB. 191) is equivalent to (jlj). This is sufficient to 
derive the propagation equation ([8]), as shown in Section[2l Finally, the free energy flB.7p 
is equivalent to ( ]A.12p . as all variable nodes of the factor graph have degree 2. 

Appendix C. Simplified equations 

In this appendix, we derive the simplified forms ([9]) and (|TOl) of the propagation 
equation ([8]) and the free energy shift IQ, respectively. Both derivations are based 
on similar manipulations. We consider the elementary energy term ([2]) associated to 
vertex i, and note that it can be written in an alternative form for each given choice of 
a neighbour vertex j G di: 

r]{xi,Xdi)= ^ JJ [1 - (5(xfc,x)], (C.l) 

xGC\xi\xj k£di\j 

where the sum runs over the colour set C, excluding the colours Xi and xj (if Xi = xj, 
just one colour is excluded). Since the product in the equation above can only take the 
values and 1, we can write the corresponding Boltzmann factor as 

Y[ {l-(l-e-^) n (C.2) 

x&C\xi\xj k&di\j 
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and expand the outer product 

^-Mx.,xe.) = J2 (-l + e-'^)l^l JJ JJ [1 (C.3) 

BCC\xi\xj xeB k€di\j 

where the sum runs over all the possible subsets B of the colour set C\xi\xj. Then, we 
exchange the two products, expand the product over x (taking into account that every 
product of two or more deltas vanishes), and use the fact that 

J2H^k,x) = l. (C.4) 

x£C 

We finally obtain 

^-Mx.,xa,)^ (_l + e-'')l^l JJ ^ (C.5) 

BCC\xi\xj k€di\j x€C\B 

The propagation equation ([H]) for a given vertex i generates an outgoing message 
mi^j{xi, Xj) as a function of the set of incoming messages mk^i{xk, Xi) (where k G di\j). 
Replacing the final expression for the Boltzmann factor ( 1C.5|) into this equation, we 
readily obtain the simplified propagation equation (|9]). 

As far as the free energy shift is concerned, we rewrite the elementary energy 
term ([2]) in yet another form, namely, 

r]{xi,Xdi) = ^ - (5(xj,x)]. (C.6) 

xGC\xi jedi 

In this case the sum runs over the colour set C, excluding only the colour Xi. A totally 
analogous derivation allows us to write 

g-/3^{x,,xa.) = J2 (-l + e-'^)!^! JJ J2 ^(^J'^)' (C-7) 

BCC\xi j&dix&C\B 

which, plugged into (E]), yields ( fTOl) . 
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